Smart Paper Notes

Multi-NeuS: 3D Head Portraits from Single Image with Neural Implicit Functions

Egor Burkov , Ruslan Rakhimov , Aleksandr Safin , Evgeny Burnaev , Victor Lempitsky

Category: Computer Vision

2022-09-07

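We present an approach for reconstructing textured 3D meshes of human heads from one or a few views. Because such few-shot reconstruction is underconstrained, it requires prior knowledge, which is hard to impose on traditional 3D reconstruction algorithms. In this work, we rely on a recently introduced 3D representation, neural implicit functions, which is based on neural networks, allows priors about human heads to be learned naturally from data, and converts directly to a textured mesh. Namely, we extend NeuS, a state-of-the-art neural implicit function formulation, to represent multiple objects of a class (human heads, in our case) simultaneously. The underlying neural network architecture is designed to learn the commonalities among these objects and to generalize to unseen ones. Our model is trained on just a hundred smartphone videos and does not require any scanned 3D data. Afterwards, the model can fit novel heads in few-shot or one-shot mode with good results.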

Translated by Google Translate

Unpaired Depth Super-Resolution in the Wild

Aleksandr Safin , Maxim Kan , Nikita Drobyshev , Oleg Voynov , Alexey Artemov , Alexander Filippov , Denis Zorin , Evgeny Burnaev

Category: Computer Vision

2021-05-25

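Depth maps captured with commodity sensors are often of low quality and resolution; these maps need to be enhanced before they can be used in many applications. State-of-the-art data-driven methods for depth map super-resolution rely on registered pairs of low- and high-resolution depth maps of the same scenes. Acquiring real-world paired data requires specialized setups. The alternative, generating low-resolution maps from high-resolution ones by subsampling, adding noise, and other artificial degradation methods, does not fully capture the characteristics of real-world low-resolution images. As a result, supervised learning methods trained on such artificial paired data may not perform well on real-world low-resolution inputs. We consider an approach to depth super-resolution based on learning from unpaired data. While many techniques for unpaired image-to-image translation have been proposed, most fail to deliver effective hole filling or to reconstruct accurate surfaces from depth maps. We propose an unpaired learning method for depth super-resolution that is based on a learnable degradation model, an enhancement component, and surface normal estimates as features to produce more accurate depth maps. We propose a benchmark for unpaired depth SR and demonstrate that our method outperforms existing unpaired methods and performs on par with paired ones.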


Reinforcement Learning with Success Induced Task Prioritization

Maria Nesterova , Alexey Skrynnik , Aleksandr Panov

Category: Machine Learning | Artificial Intelligence

2022-12-30

Many challenging reinforcement learning (RL) problems require designing a distribution of tasks that can be applied to train effective policies. This distribution of tasks can be specified by a curriculum. A curriculum is meant to improve the results of learning and to accelerate it. We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning, where a task sequence is created based on the success rate of each task. In this setting, each task is an algorithmically created environment instance with a unique configuration. The algorithm selects the order of tasks that provides the fastest learning for agents. The probability of selecting any of the tasks for the next stage of learning is determined by evaluating its performance score in previous stages. Experiments were carried out in the Partially Observable Grid Environment for Multiple Agents (POGEMA) and the Procgen benchmark. We demonstrate that SITP matches or surpasses the results of other curriculum design methods. Our method can be implemented with a handful of minor modifications to any standard RL framework and provides useful prioritization with minimal computational overhead.
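
The success-based task selection described above can be illustrated with a minimal sketch. The abstract does not give the scoring rule, so everything here is an assumption for illustration: the score `1 - success_rate`, the softmax with a `temperature` parameter, and the function names `task_probabilities` and `sample_task` are hypothetical and are not taken from SITP itself.

```python
import math
import random


def task_probabilities(success_rates, temperature=0.5):
    """Map per-task success rates to sampling probabilities.

    Assumption (not from the paper): a task with a lower success rate
    gets a higher score, score = 1 - success_rate, and the scores are
    turned into probabilities with a softmax.
    """
    scores = [1.0 - s for s in success_rates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp((x - top) / temperature) for x in scores]
    total = sum(weights)
    return [w / total for w in weights]


def sample_task(success_rates, rng, temperature=0.5):
    """Draw the index of the task to train on at the next stage."""
    probs = task_probabilities(success_rates, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Three tasks: nearly solved, half solved, rarely solved.
rates = [0.9, 0.5, 0.1]
probs = task_probabilities(rates)
rng = random.Random(0)
next_task = sample_task(rates, rng)
```

Under this assumed rule the rarely solved task is sampled most often; a different score, for example the change in success rate between stages, could be substituted without touching the sampling step.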


HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images

Dmitry Yudin , Yaroslav Solomentsev , Ruslan Musaev , Aleksei Staroverov , Aleksandr I. Panov

Category: Computer Vision | Artificial Intelligence

2022-12-30

We present a novel dataset named HPointLoc, specially designed for exploring the capabilities of visual place recognition in indoor environments and loop detection in simultaneous localization and mapping. The loop detection sub-task is especially relevant when a robot with an on-board RGB-D camera can drive past the same place ("Point") at different angles. The dataset is based on the popular Habitat simulator, in which it is possible to generate photorealistic indoor scenes using both one's own sensor data and open datasets, such as Matterport3D. To study the main stages of solving the place recognition problem on the HPointLoc dataset, we propose a new modular approach named PNTR. It first performs image retrieval with the Patch-NetVLAD method, then extracts keypoints and matches them using R2D2, LoFTR, or SuperPoint with SuperGlue, and finally performs a camera pose optimization step with TEASER++. Such a solution to the place recognition problem has not been previously studied in existing publications. The PNTR approach has shown the best quality metrics on the HPointLoc dataset and has high potential for real use in localization systems for unmanned vehicles. The proposed dataset and framework are publicly available: https://github.com/metra4ok/HPointLoc.


Policy Optimization to Learn Adaptive Motion Primitives in Path Planning with Dynamic Obstacles

Brian Angulo , Aleksandr Panov , Konstantin Yakovlev

Category: Robotics

2022-12-29

This paper addresses kinodynamic motion planning for non-holonomic robots in environments with both static and dynamic obstacles, a challenging problem that still lacks a universal solution. One promising approach is to decompose the problem into smaller sub-problems and combine the local solutions into a global one. The crux of any planning method for non-holonomic robots is the generation of motion primitives that provide solutions to the local planning sub-problems. In this work we introduce a novel learnable steering function (policy), which takes into account the kinodynamic constraints of the robot and both static and dynamic obstacles. This policy is efficiently trained via policy optimization. Empirically, we show that our steering function generalizes well to unseen problems. We then plug the trained policy into sampling-based and lattice-based planners, and evaluate the resulting POLAMP algorithm (Policy Optimization that Learns Adaptive Motion Primitives) in a range of challenging setups that involve a car-like robot operating in obstacle-rich parking-lot environments. We show that POLAMP is able to plan collision-free kinodynamic trajectories with success rates higher than 92% when 50 simultaneously moving obstacles populate the environment, showing better performance than the state-of-the-art competitors.


TransPath: Learning Heuristics For Grid-Based Pathfinding via Transformers

Daniil Kirilenko , Anton Andreychuk , Aleksandr Panov , Konstantin Yakovlev

Category: Artificial Intelligence | Machine Learning

2022-12-22

Heuristic search algorithms, e.g. A*, are the commonly used tools for pathfinding on grids, i.e. graphs of regular structure that are widely employed to represent environments in robotics, video games, etc. Instance-independent heuristics for grid graphs, e.g. Manhattan distance, do not take obstacles into account, and thus the search led by such heuristics performs poorly in obstacle-rich environments. To this end, we suggest learning instance-dependent heuristic proxies that are supposed to notably increase the efficiency of the search. The first heuristic proxy we suggest learning is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one (computed offline at the training phase). Unlike learning the absolute values of the cost-to-go heuristic function, which was known before, learning the correction factor utilizes the knowledge of the instance-independent heuristic. The second heuristic proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path. This heuristic can be utilized in the Focal Search framework as the secondary heuristic, allowing us to preserve the guarantees on the bounded sub-optimality of the solution. We learn both suggested heuristics in a supervised fashion with state-of-the-art neural networks containing attention blocks (transformers). We conduct a thorough empirical evaluation on a comprehensive dataset of planning tasks, showing that the suggested techniques i) reduce the computational effort of A* by up to a factor of 4x while producing solutions whose costs exceed the costs of the optimal solutions by less than 0.3% on average; ii) outperform the competitors, which include conventional techniques from heuristic search, i.e. weighted A*, as well as state-of-the-art learnable planners.
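
To make the correction factor concrete, the following sketch computes it for every reachable cell of a small grid. It only shows how a training target of this kind could be produced offline. The assumptions are the 4-connected grid with unit move costs, Manhattan distance as the instance-independent heuristic, a factor of 1.0 at the goal cell, and the hypothetical helper names `perfect_cost_to_go` and `correction_factors`. The network that learns the factor is not shown.

```python
from collections import deque


def perfect_cost_to_go(grid, goal):
    """Breadth-first search from the goal over free cells (0 = free, 1 = obstacle).

    Returns a dict mapping each reachable cell to its true shortest-path
    cost to the goal on a 4-connected grid with unit move costs.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def correction_factors(grid, goal):
    """Ratio of the Manhattan estimate to the perfect cost-to-go per cell."""
    factors = {}
    for cell, h_star in perfect_cost_to_go(grid, goal).items():
        manhattan = abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
        # At the goal both values are 0; the ratio is defined as 1.0 here.
        factors[cell] = 1.0 if h_star == 0 else manhattan / h_star
    return factors


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
goal = (2, 0)
cf = correction_factors(grid, goal)
```

In this grid the wall forces a detour from cell (0, 0): the Manhattan estimate is 2 while the true cost is 6, so the factor is 2/6. Cells with an unobstructed route, such as (2, 2), keep a factor of 1.0.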


Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

Aleksandr Laptev , Boris Ginsburg

Category: Natural Language Processing | Machine Learning

2022-12-16

This paper presents a class of new fast non-trainable entropy-based confidence estimation methods for automatic speech recognition. We show how per-frame entropy values can be normalized and aggregated to obtain a confidence measure per unit and per word for Connectionist Temporal Classification (CTC) and Recurrent Neural Network Transducer (RNN-T) models. The proposed methods have computational complexity similar to that of the traditional method based on the maximum per-frame probability, but they are more adjustable, have a wider effective threshold range, and better push apart the confidence distributions of correct and incorrect words. We evaluate the proposed confidence measures on LibriSpeech test sets, and show that they are up to 2 and 4 times better than confidence estimation based on the maximum per-frame probability at detecting incorrect words for Conformer-CTC and Conformer-RNN-T models, respectively.
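
The idea of normalizing per-frame entropy and aggregating it per word can be sketched as follows. This is one simple variant chosen for illustration, not necessarily one of the measures evaluated in the paper. It assumes Shannon entropy, a linear normalization by the maximum entropy ln(V) for a vocabulary of size V, and the hypothetical function names `frame_confidence` and `word_confidence`.

```python
import math
import statistics


def frame_confidence(probs):
    """Confidence of one frame from its output distribution.

    Assumption: confidence = 1 - H(p) / ln(V), with Shannon entropy H
    and vocabulary size V. A one-hot distribution yields 1.0 and a
    uniform distribution yields 0.0.
    """
    vocab_size = len(probs)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return 1.0 - entropy / math.log(vocab_size)


def word_confidence(frames, aggregate=min):
    """Aggregate the frame confidences that belong to one word."""
    return aggregate([frame_confidence(p) for p in frames])


# Two frames of one word over a toy 4-symbol vocabulary.
frames = [
    [0.97, 0.01, 0.01, 0.01],  # sharply peaked frame
    [0.40, 0.30, 0.20, 0.10],  # uncertain frame
]
conf_min = word_confidence(frames, aggregate=min)
conf_mean = word_confidence(frames, aggregate=statistics.mean)
```

With `min`, a single uncertain frame pulls the whole word down, while `statistics.mean` smooths over it; which aggregation suits a given model is an empirical question.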


Sequential Kernelized Independence Testing

Aleksandr Podkopaev , Patrick Blöbaum , Shiva Prasad Kasiviswanathan , Aaditya Ramdas

Category: Machine Learning (Statistics) | Machine Learning

2022-12-14

Independence testing is a fundamental and classical statistical problem that has been extensively studied in the batch setting when one fixes the sample size before collecting data. However, practitioners often prefer procedures that adapt to the complexity of a problem at hand instead of setting sample size in advance. Ideally, such procedures should (a) allow stopping earlier on easy tasks (and later on harder tasks), hence making better use of available resources, and (b) continuously monitor the data and efficiently incorporate statistical evidence after collecting new data, while controlling the false alarm rate. It is well known that classical batch tests are not tailored for streaming data settings, since valid inference after data peeking requires correcting for multiple testing, but such corrections generally result in low power. In this paper, we design sequential kernelized independence tests (SKITs) that overcome such shortcomings based on the principle of testing by betting. We exemplify our broad framework using bets inspired by kernelized dependence measures such as the Hilbert-Schmidt independence criterion (HSIC) and the constrained-covariance criterion (COCO). Importantly, we also generalize the framework to non-i.i.d. time-varying settings, for which there exist no batch tests. We demonstrate the power of our approaches on both simulated and real data.


Support Vector Machine for Determining Euler Angles in an Inertial Navigation System

Aleksandr N. Grekov , Aleksei A. Kabanov , Sergei Yu. Alekseev

Category: Robotics | Artificial Intelligence

2022-12-07

The paper discusses improving the accuracy of an inertial navigation system built on MEMS sensors using machine learning (ML) methods. As input data for the classifier, we used information obtained from a purpose-built laboratory setup with MEMS sensors on a sealed platform whose tilt angles can be adjusted. To assess the effectiveness of the models, test curves were constructed for different values of the model parameters for each kernel: linear, polynomial, and radial basis function. The inverse regularization parameter was used as the varied parameter. The proposed ML-based algorithm has demonstrated its ability to classify correctly in the presence of noise typical of MEMS sensors, with good classification results obtained when the optimal hyperparameter values are chosen.


Construction of Object Boundaries for the Autopilot of a Surface Robot from Satellite Images using Computer Vision Methods

Aleksandr N. Grekov , Yurii E. Shishkin , Sergei S. Peliushenko , Aleksandr S. Mavrin

Category: Computer Vision

2022-12-05

An algorithm and a program for detecting the boundaries of water bodies for the autopilot module of a surface robot are proposed. Water objects are detected on satellite maps by searching for a color in the HSV color space and applying erosion and dilation, two digital image filtering methods. The following operators for constructing contours in the image are investigated: Sobel, Roberts, and Prewitt; the one that detects the boundary most accurately is selected for this module. An algorithm for calculating the GPS coordinates of the contours is created. The proposed algorithm allows saving the result in a format suitable for the surface robot autopilot module.
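
The detection steps named in the abstract (HSV color search, erosion and dilation, contour operator) can be sketched in plain Python on a toy image. This is an illustrative sketch, not the authors' program. The hue and saturation thresholds, the 3x3 neighborhood, the choice of Sobel among the three operators, and all function names are assumptions. The conversion of contours to GPS coordinates is not shown.

```python
import colorsys


def water_mask(image, hue_range=(0.5, 0.72), min_saturation=0.3):
    """Mark pixels whose HSV hue falls in an assumed 'blue' range.

    image is a list of rows of (r, g, b) tuples with values in 0..255.
    """
    mask = []
    for row in image:
        out = []
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_water = hue_range[0] <= h <= hue_range[1] and s >= min_saturation
            out.append(1 if is_water else 0)
        mask.append(out)
    return mask


def _morph(mask, combine):
    """Apply min (erosion) or max (dilation) over each 3x3 neighborhood."""
    rows, cols = len(mask), len(mask[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                mask[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            result[r][c] = combine(window)
    return result


def erode(mask):
    return _morph(mask, min)


def dilate(mask):
    return _morph(mask, max)


def sobel_edges(mask):
    """Mark interior pixels with a non-zero Sobel gradient magnitude."""
    rows, cols = len(mask), len(mask[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (mask[r - 1][c + 1] + 2 * mask[r][c + 1] + mask[r + 1][c + 1]
                  - mask[r - 1][c - 1] - 2 * mask[r][c - 1] - mask[r + 1][c - 1])
            gy = (mask[r + 1][c - 1] + 2 * mask[r + 1][c] + mask[r + 1][c + 1]
                  - mask[r - 1][c - 1] - 2 * mask[r - 1][c] - mask[r - 1][c + 1])
            edges[r][c] = 1 if gx * gx + gy * gy > 0 else 0
    return edges


# Toy 6x6 "satellite tile": water on the left, land on the right,
# plus one isolated blue pixel on land that acts as noise.
WATER, LAND = (30, 90, 200), (40, 160, 60)
image = [[WATER] * 3 + [LAND] * 3 for _ in range(6)]
image[1][4] = WATER

mask = water_mask(image)
opened = dilate(erode(mask))  # erosion then dilation removes the noise pixel
edges = sobel_edges(opened)
```

Eroding before dilating removes the isolated pixel while restoring the shoreline to its original column, so the Sobel step marks only the water-land boundary.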
